trybuild2 1.2.0

Test harness for ui tests of compiler diagnostics (with support for inline tests)

Trybuild2

trybuild2 is a fork of trybuild that adds support for inline tests.

Trybuild is a test harness for invoking rustc on a set of test cases and asserting that any resulting error messages are the ones intended.

Such tests are commonly useful for testing error reporting involving procedural macros. We would write test cases triggering either errors detected by the macro or errors detected by the Rust compiler in the resulting expanded code, and compare against the expected errors to ensure that they remain user-friendly.

This style of testing is sometimes called ui tests because they test aspects of the user's interaction with a library outside of what would be covered by ordinary API tests.

Nothing here is specific to macros; trybuild2 would work equally well for testing misuse of non-macro APIs.

[dev-dependencies]
trybuild2 = "1.0"

Compiler support: requires rustc 1.45+

Compile-fail tests

A minimal trybuild2 setup looks like this:

#[test]
fn ui() {
    let t = trybuild2::TestCases::new();
    t.compile_fail("tests/ui/*.rs");
}

The test can be run with cargo test. It will individually compile each of the source files matching the glob pattern, expect them to fail to compile, and assert that the compiler's error message matches an adjacently named *.stderr file containing the expected output (same file name as the test except with a different extension). If it matches, the test case is considered to succeed.
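The pairing between a test case and its expected output can be sketched in a few lines. This is only an illustration of the naming rule described above, not trybuild2's actual implementation; the helper name expected_stderr_path is hypothetical.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: returns the path of the expected-output file for a
/// test case, i.e. the same file name with its extension replaced by `stderr`.
fn expected_stderr_path(test_case: &Path) -> PathBuf {
    test_case.with_extension("stderr")
}
```

Under this rule, tests/ui/missing-repr.rs is checked against tests/ui/missing-repr.stderr.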

Dependencies listed under [dev-dependencies] in the project's Cargo.toml are accessible from within the test cases.

Failing tests display the expected vs actual compiler output inline.

A compile_fail test that fails to fail to compile is also a failure.

To test just one source file, use:

cargo test -- ui trybuild2=example.rs

where ui is the name of the #[test] function that invokes trybuild2, and example.rs is the name of the file to test.
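One way such a filter could work is sketched below. The trybuild2= prefix comes from the command above; everything else (the function names and the assumption that a filter selects any test whose path contains it) is hypothetical and may differ from what the crate really does.

```rust
/// Prefix of the filter argument, as used in the command line above.
const PREFIX: &str = "trybuild2=";

/// Hypothetical helper: collects the non-empty values of all
/// `trybuild2=...` arguments, ignoring every other argument.
fn parse_filters<I: IntoIterator<Item = String>>(args: I) -> Vec<String> {
    args.into_iter()
        .filter_map(|arg| arg.strip_prefix(PREFIX).map(str::to_owned))
        .filter(|filter| !filter.is_empty())
        .collect()
}

/// Hypothetical helper: a test is selected when no filter was given, or
/// when its path contains at least one of the filters (an assumption).
fn is_selected(path: &str, filters: &[String]) -> bool {
    filters.is_empty() || filters.iter().any(|f| path.contains(f.as_str()))
}
```

With the arguments ui and trybuild2=example.rs, this sketch would select tests/ui/example.rs and skip every other file.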

Pass tests

The same test harness is able to run tests that are expected to pass, too. Ordinarily you would just have Cargo run such tests directly, but being able to combine modes like this could be useful for workshops in which participants work through test cases enabling one at a time. Trybuild was originally developed for my procedural macros workshop at Rust Latam.

#[test]
fn ui() {
    let t = trybuild2::TestCases::new();
    t.pass("tests/01-parse-header.rs");
    t.pass("tests/02-parse-body.rs");
    t.compile_fail("tests/03-expand-four-errors.rs");
    t.pass("tests/04-paste-ident.rs");
    t.pass("tests/05-repeat-section.rs");
    //t.pass("tests/06-make-work-in-function.rs");
    //t.pass("tests/07-init-array.rs");
    //t.compile_fail("tests/08-ident-span.rs");
}

Pass tests are considered to succeed if they compile successfully and have a main function that does not panic when the compiled binary is executed.

Details

That's the entire API.

Workflow

There are two ways to update the *.stderr files as you iterate on your test cases or your library; handwriting them is not recommended.

First, if a test case is being run as compile_fail but a corresponding *.stderr file does not exist, the test runner will save the actual compiler output with the right filename into a directory called wip within the directory containing Cargo.toml. So you can update these files by deleting them, running cargo test, and moving all the files from wip into your testcase directory.

Alternatively, run cargo test with the environment variable TRYBUILD2=overwrite to skip the wip directory and write all compiler output directly in place. You'll want to check git diff afterward to be sure the compiler's output is what you had in mind.
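The two update paths can be summarized as a small decision about where the captured output ends up. The sketch below only models that decision as described above; the UpdateMode type and both function names are hypothetical and are not part of trybuild2's API.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical model of the two ways to update *.stderr files.
#[derive(Clone, Copy, Debug, PartialEq)]
enum UpdateMode {
    /// Default: output for missing *.stderr files is saved under `wip`.
    Wip,
    /// TRYBUILD2=overwrite: output is written directly in place.
    Overwrite,
}

/// Maps the value of the TRYBUILD2 environment variable (if set) to a mode.
fn mode_from_env_value(value: Option<&str>) -> UpdateMode {
    match value {
        Some("overwrite") => UpdateMode::Overwrite,
        _ => UpdateMode::Wip,
    }
}

/// Computes where the compiler output for `test_case` would be written.
/// `project_dir` is the directory containing Cargo.toml and `test_case`
/// is relative to it. Returns None if the path has no file name.
fn stderr_destination(project_dir: &Path, test_case: &Path, mode: UpdateMode) -> Option<PathBuf> {
    let stderr_path = test_case.with_extension("stderr");
    match mode {
        // Written next to the test case itself.
        UpdateMode::Overwrite => Some(project_dir.join(stderr_path)),
        // Written into the wip directory, keeping only the file name.
        UpdateMode::Wip => Some(project_dir.join("wip").join(stderr_path.file_name()?)),
    }
}
```

In practice the mode would come from something like std::env::var("TRYBUILD2").ok().as_deref(); keeping the environment lookup out of the functions makes the logic easy to check in isolation.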

What to test

When it comes to compile-fail tests, write tests for any case where you would want to know about a change in the user-facing compiler output. As a negative example, please don't write compile-fail tests simply calling all of your public APIs with arguments of the wrong type; there would be no benefit.

A common use would be for testing specific targeted error messages emitted by a procedural macro. For example the derive macro from the ref-cast crate is required to be placed on a type that has either #[repr(C)] or #[repr(transparent)] in order for the expansion to be free of undefined behavior, which it enforces at compile time:

error: RefCast trait requires #[repr(C)] or #[repr(transparent)]
 --> $DIR/missing-repr.rs:3:10
  |
3 | #[derive(RefCast)]
  |          ^^^^^^^

Macros that consume helper attributes will want to check that unrecognized content within those attributes is properly indicated to the caller. Is the error message correctly placed under the erroneous tokens, not on a useless call_site span?

error: unknown serde field attribute `qqq`
 --> $DIR/unknown-attribute.rs:5:13
  |
5 |     #[serde(qqq = "...")]
  |             ^^^

Declarative macros can benefit from compile-fail tests too. The json! macro from serde_json is just a great big macro_rules macro but makes an effort to have error messages from broken JSON in the input always appear on the most appropriate token:

error: no rules expected the token `,`
 --> $DIR/double-comma.rs:4:38
  |
4 |     println!("{}", json!({ "k": null,, }));
  |                                      ^ no rules expected this token in macro call

Sometimes we may have a macro that expands successfully but we count on it to trigger particular compiler errors at some point beyond macro expansion. For example the readonly crate introduces struct fields that are public but readable only, even if the caller has a &mut reference to the surrounding struct. If someone writes to a readonly field, we need to be sure that it wouldn't compile:

error[E0594]: cannot assign to data in a `&` reference
  --> $DIR/write-a-readonly.rs:17:26
   |
17 |     println!("{}", s.n); s.n += 1;
   |                          ^^^^^^^^ cannot assign

In all of these cases, the compiler's output can change because our crate or one of our dependencies broke something, or as a consequence of changes in the Rust compiler. Both are good reasons to have well-conceived compile-fail tests. If we refactor and mistakenly cause an error that used to be emitted correctly to no longer be emitted, or to be emitted in the wrong place, that is important for a test suite to catch. If the compiler changes something that makes error messages we care about substantially worse, that is also important to catch and report as a compiler issue.

License